This tutorial was originally published on DataCareer.
In financial markets, tradable instruments and securities have unique identifiers. The identifiers are very useful, because you can make sure that you and your counterparty are talking about the same instrument while trading. The difficulty is that there isn't really a standard for all the various sorts of instruments or markets. Anyone working in the industry will recognize this issue, especially people working at larger institutions who deal with a variety of instruments. Products like equities, bonds (fixed income), indices, derivatives, currencies and structured products all have their own conventions. Just to name a few identifiers, you may come across:
Fortunately, the problem is recognized and there are steps being made to tackle this. Bloomberg has initiated OpenFigi and Refinitiv (formerly Thomson Reuters) PermID. Basically, they are open sourcing their proprietary identifiers. This makes it easier to map instruments to other identifiers which you may use.
Please take a look at the following websites, before continuing this tutorial:
For OpenFIGI we will use some of the example code which is published at: https://github.com/OpenFIGI/api-examples/blob/master/python/example-with-requests.py. @Jaret, thanks for sharing!
As always, let's start with importing some packages:
import requests # 2.19.1
import json # 2.0.9
import pandas as pd # 0.23.4
import pprint
We will get the identifiers for a handful of companies listed in the DAX (Deutscher Aktienindex) in Germany. It is a blue chip stock market index consisting of the 30 major German companies trading on the Frankfurt Stock Exchange. We randomly chose 5 tickers, feel free to change the list or add any you may like.
tickers = ['ADS', 'BAS', 'DTE', 'SAP', 'SIE'] # Adidas, BASF, Deutsche Telekom, SAP & Siemens
The Market Identifier Code (MIC) for Xetra in Frankfurt Germany is XETR
. If we specify this in our mapping requests, we have a better chance of getting the right identifiers back. Because some tickers my refer to other companies somewhere else in the world. For clarity, we write every mapping job out fully. For bigger lists, it is probably better to use a loop.
jobs = [
{'idType': 'TICKER', 'idValue': 'ADS', 'micCode': 'XETR'},
{'idType': 'TICKER', 'idValue': 'BAS', 'micCode': 'XETR'},
{'idType': 'TICKER', 'idValue': 'DTE', 'micCode': 'XETR'},
{'idType': 'TICKER', 'idValue': 'SAP', 'micCode': 'XETR'},
{'idType': 'TICKER', 'idValue': 'SIE', 'micCode': 'XETR'}
]
# Alternatively, if you start with ISIN codes:
# jobs = [{'idType': 'ID_ISIN', 'idValue': 'DE0005557508', 'micCode': 'XETR'}]
Now we have defined the jobs, we need to set some of the configuration and define the function we can use for the jobs. You don't necessarily need an API key for FIGI, but with one you will have higher rate limits.
openfigi_apikey = '' # Please put your own API Key here
openfigi_url = 'https://api.openfigi.com/v2/mapping'
openfigi_headers = {'Content-Type': 'text/json'}
def map_jobs(jobs):
if openfigi_apikey:
openfigi_headers['X-OPENFIGI-APIKEY'] = openfigi_apikey
response = requests.post(url=openfigi_url, headers=openfigi_headers,
json=jobs)
if response.status_code != 200:
raise Exception('Bad response code {}'.format(str(response.status_code)))
return response.json()
Now let's execute the function and print the results to the screen.
job_results = map_jobs(jobs)
pprint.pprint(job_results)
As you can see, the result of the mapping job is a list of dictionaries. For every ticker, the 'key' is called "data". Let's change the structure a little bit, so we can easily convert it to a Pandas DataFrame
. We don't need the keys named 'data', just the list of their values.
just_dictionaries = [d['data'][0] for d in job_results]
With this new list of dictionaries, we can directly convert it:
df_figi = pd.DataFrame.from_dict(just_dictionaries)
df_figi
The DataFrame actually contains a little bit more information than we need for this tutorial. We can filter the interesting columns and set the tickers to be the index.
# Columns of interest
columns = ['ticker', 'name', 'marketSector', 'figi', 'uniqueID']
# Filter out the columns of interest
df_figi = df_figi[columns]
# Set the tickers to be the index
df_figi = df_figi.set_index('ticker')
# Show the DataFrame
df_figi
Excellent. We now have mapped tickers to their respective FIGIs. Time for the next step: PermIDs.
PermID is short for "permanent identifier". It is the open sourced identifier system of Refinitiv (formerly Thomson Reuters). You do need to register at the Developer Portal to be able to use their APIs. In this tutorial, we will use the Record Matching API
to map tickers to their respective PermID's. You can find a 'quick start' at: https://developers.refinitiv.com/en/api-catalog/open-perm-id/permid-record-matching-restful-api/quick-start.
NB: At the time of writing, it is not possible to use ISINs as input nor to get RICs as output. Hopefully this will be added to the API's functionality in the future.
Please fill in you own access token before continuing:
# Your own access token
access_token = ''
Plus we define the configuration:
# API endpoint
request_url = "https://api.thomsonreuters.com/permid/match"
headers = {
'Content-Type': 'text/plain',
'Accept': 'application/json',
'x-ag-access-token': access_token,
'x-openmatch-numberOfMatchesPerRecord': '1', # only return 1 match per ticker
'x-openmatch-dataType': 'Organization', # only match to "organizations", not "persons" or other tags
}
Now it gets a little bit more tricky. As you can read in the documentation, the API excepts a CSV file or "specially formatted text" as input. We are working with a list of tickers in this tutorial, so we need to construct the "specially formatted text" ourselves.
# The first line in the text field is 'Standard Identifier'. We use 'Ticker' as identifier in this tutorial
text_field = 'Standard Identifier\n'
# For every ticker, we will add a new line and specify the Market Identifier Code (MIC) / Exchange
exchange = 'XETR'
for ticker in tickers:
identifier = 'TICKER:' + ticker + '&&MIC:' + exchange + '\n'
text_field += identifier
# Print the 'text_field' to screen for inspection
print(text_field)
Now it is time to construct & make the actual request.
response = requests.post(request_url, headers=headers, data=text_field)
r = response.json()
The information are looking for is in 'outputContentResponse'. We can print it to the screen for inspection.
# Pretty print the response to the screen
pprint.pprint(r['outputContentResponse'])
Let's quickly loop through the results to check out the PermID's.
for company in r['outputContentResponse']:
print(company['Match OrgName'] + ' --> ' + company['Match OpenPermID'])
Did you check out any of the links? PermID offers some additional data besides the identifier. Let's define a function which will help us to retrieve the additional data.
def permid_data(permid_url):
permid_headers = {
'Accept': 'text/turtle',
}
permid_params = {
'format': 'json-ld',
'access-token': access_token
}
# The actual request
permid_response = requests.get(permid_url, headers=headers, params=permid_params)
# Convert the response to JSON
permid_data = json.loads(permid_response.content)
return permid_data
Now we will create a dictionary of dictionaries, one for every ticker we are interested in. We do this so we can then later convert it easily to a Pandas DataFrame
. So, we use a loop to go through each company in the list of the tickers to:
# Create an empty dictionary
permid_dict = {}
# Loop through all tickers and put the data in the dictionary
for ticker, i in zip(tickers, r['outputContentResponse']):
# The PermID url for the ticker from the response earlier
permid_url = i['Match OpenPermID']
# Use the function defined above to download the data
data = permid_data(permid_url)
# Put the desired data in a dictionary for the ticker
permid_dict[ticker] = {
'company': data['vcard:organization-name'],
'IPO' : data['hasIPODate'],
'address': data['mdaas:HeadquartersAddress'],
'website': data['hasURL'],
'phone' : data['tr-org:hasHeadquartersPhoneNumber'],
'LEI' : data['tr-org:hasLEI'],
'permid' : data['tr-common:hasPermId'],
'permid_url' : permid_url
}
All the data is now in a dictionary of dictionaries. Again, it is also very easy to convert this to a Pandas DataFrame
.
df_permid = pd.DataFrame.from_dict(permid_dict, orient='index') # Orient='index' for data in rows instead of columns
df_permid
As a final touch, let's join both DataFrames:
df_final = df_figi.join(df_permid)
df_final
We started with only the tickers for a handful of companies. Finally, we have for every ticker the full company name, FIGI, PermID, Legal Entity Identifier (LEI), initial public offering date, physical address, website and phone number.
Not bad, right?